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ABSTRACT 

We develop an estimator for the correlation function which, in the ensemble average, returns the 
shape of the correlation function, even for signals that have significant correlations on the scale of the 
survey region. Our estimator is general and works in any number of dimensions. We develop versions 
of the estimator for both diffuse and discrete signals. As an application, we examine Monte Carlo 
simulations of X-ray background measurements. These include a realistic, spatially-inhomogeneous 
population of spurious detector events. We discuss applying the estimator to the averaging of corre- 
lation functions evaluated on several small fields, and to other cosmological applications. 
Subject headings: cosmology: theory — methods: numerical — methods: data analysis — methods: 
statistical — X-rays: diffuse background — galaxies: clustering 
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1. INTRODUCTION 

Two-point statistics encode valuable information about the fields that they describe, such as the cosmological matter 
density traced by galaxies or the intensity of radiation in backgrounds like the Cosmic Microwave Background (CMB), 
the Cosmic Infrared Background (CIB), or the Diffuse X-ray Background (DXB). 

For discrete objects, the two-point, dimensionless correlation function can be defin ed in terms of the probability of 
finding a pair of objects in two small cells, with sizes Sili and Sn 2 , separated by 12 ( |Peebles|[l980l §31, 45): 



SP12 =N 2 5ili8il 2 [l + w(9i 2 )} (1) 
where Af is the mean density of sources. For diffuse fields, the equivalent definition for a signal s with mean (s) = fj, is 



(sis 2 ) =[i 2 [l + w(9 12 )} 



(2) 



where here and throughout (...) denotes the ensemble averagej^] We denote the covariance of s as C(6) = fj, 2 w(9), 
which we also refer to as the (dimensionful) correlation function. This work mostly deals with the dimensionful 
correlation function and addresses the bias in its estimation. With similar expressions, we can define correlation 
functions in any number of dimensions, replacing the angular separation by a linear separation or time interval or 
whatever is appropriate. 

The estimat i on of the correla t ion function has been studi ed extensively in the literature. For galaxy clustering, 
Hewett (1982), Davis & Peebles ( 1983J ), and [Hamilton (fl993| sugg e st diff erent Monte Carlo estimators, but the most 
common estimator now m use was advocated by Landy & Szalay (19931, which employs the data in concert with a 
synthetic, random catalog. Their estimator combines counts of objects pairs within and between the data and random 
catalogs. This estimator is biased, but for surveys where the correlation length of the objects is much smaller than 
the survey area, the bias is small (|Bernstein||1994 ). Such is the case for modern galaxy surveys like 2dF and SDSS 
rtPercivalet al.| |2001| [York et al.||20UU|. However, the )ias can become significant wh en structures approach the size 
of the survey (|Kerscher|| l999| ). This bias can be corrected (e.g. Scranton et al. 2002), but the correction depends on 



same correlation function that is being estimated. 

For diffuse signals like the CMB ; where using the dim ensionful correlation function is more common, a typical 
estimator looks like ( |Hinshaw et al.|[l996 Copi et al~||2007 ): 



Co(0) 



T,ij a i a j( s i- aOOj -A) 

Ey a i a 3 



(3) 



where on are the weights applied to the pixels or cells (for the purpose of downweighting noisy regions) , fl is an estimate 
of the mean, and the sum over ij refers to pixels separated by 9. These estimators suffer the same biases on small 
fields. 

In this paper we introduce a new method to address the biases in these above estimators. Our estimator is also 
biased, but biased in a particularly convenient way: regardless of the survey geometry or weighting, the shape of 
the correlation function is preserved on average, and only information about a constant offset is lost. This permits 
the straightforward averaging of correlation functions from several small patches across the sky. Building upon the 
estimator in eqn. pi), we develop classes of estimators for both diffuse signals and discrete objects. 



1 If the signal s records the object count in a cell with size 5f2, then (s) = y, = AfSQ. If the cells are so small that they contain at most 
one object, (s\S2) = SP12, making the correspondence between the two definitions clear. 
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This work was prompted by our group's efforts to compute correlation function from observations of the diffuse X-ray 
background. The signal in that case comes from a diffuse, gaseous source, but arrives and is recorded as individual, 
discrete X-ray photons, a nd so can be analy zed with eithe r scheme above. Indee d, for simulations of diffuse X-ray 



emission from the WHIM, Ursino et al. ( 2011| found that the Landy & Szalay ( 1993 ) estimator gave roughly equivalent 
results to an estimator of the type in eqn. p]). We focused on the correlation function biases because the angular 
correlation scale of this gas (several arcminutes) is substantial compared to the field-of-view (~ 8 arcminutes) for 
single-field observations with the Chandra X-ray Observatory. 

The paper is organized as follows. In section [2] we find the bias for the naive estimator (eqn. [3]), verifying our result 
with Monte Carlo simulations, and introduce a method for correcting it up to a constant offset. In section[3]we extend 
this estimate to Poisson-distributed counts, allowing for the possibility of a spatially-varying set of spurious detector 
events. Finally, we summarize our conclusions in section [4] An appendix contains the detailed derivations of the bias 
terms. 

2. CORRELATION FUNCTION ESTIMATOR BIAS 

We begin by defining our signals. Let Si represent a pixelized, diffuse signal that is statistically homogeneous and 
isotropic. Let it be described by a mean and covariance as follows: 

<*i)=M (4) 
(( Si - n)(sj - m)> = (siSj) - (J 2 = C(%) 

where (9^ represents the separation between cells i and j. In our derivations we use C{9) rather that w{9) because the 
examination of biases is convenient; C{9) also makes sense for diffuse fields where /i = 0. No other special properties 
of s are required, except that the covariance matrix is positive semi-definite: < |C(#)| < C(0). In particular, the 
signal need not be a Gaussian random field: we could define higher-order moments without disrupting our following 
arguments. Note that by this definition, the correlation function C(9) is a property of the probability distribution for 
our signal s, and it is not a descriptive statistic. 
With a set of weights on the pixels, cti, we can compute a weighted average to estimate the mean, 

0=£i** (5) 

where the sum is over all pixels. These weights could be chosen to be uniform or to suppress noisy or polluted portions 
of the measurement. Throughout we mark estimated quantities with tildes. This mean estimate is unbiased, (fi) = /i. 
Additionally we define the deviation between the true mean and the estimated mean by 

5fi = ft — n (6) 

with (Sfi) = 0. 

2.1. Naive correlation function estimator 

Based on the estimated mean, we make an initial estimate of the correlation function in a bin labeled by 9 P , which 
we call the naive estimator: 

^ i a \ S« dy(6 , p)a J a i (s l - fx)( Sj - p.) 

This is just a more explicit rewriting of eqn. ([3]). The function 

d ^ ^ _ j 1, if i and j are separated by 9 P ± 69/2 ^ 



WPi ~ 1 o, otherwise 



chooses the separation bin to which the pixel sum contributes^] Evaluation of the estimator costs 0(N 2 ) operations 
over N pixels. If the true mean fj, replaces the estimated mean fi in eqn. ([7]), then this correlation function estimate 

is unbiased^] and we find (Co(9 p )) = C(9 p ). However, since we do not know the true mean, our estimate will be 
biased, because we are forced to use the same (correlated) set of pixels to compute the mean and the correlation 
function. The smaller the s urvey compared to the correlation length of the signal, the worse this bias — the "integral 



constraint" — becomes. (See |Hamilton ( 1993 ) for further discussion of bias due to the mean error and other approaches 
to avoid it.) 

In the appendix, we compute the bias explicitly. We further show that the ensemble average of the naive, biased 
estimator may be cast as a linear operation applied to the true correlation function: 

<<%(0p)>=X> M C(0 9 ), (9) 



2 dij(6) is equivalent to the function defined by Landy Sz Szalayj |l993 I. In practice we loop over all pixels and just select which 
separation bin is appropriate to accumulate the sum. 

3 Technically, biases are also introduced by averaging the smooth sky into pixels — this pixel window function is severe if the pixels 
approach the size of the correlation length — and by binning the smooth correlation function into a stepwise function. These can often be 
made insignificant by choosing finer discretization schemes, and we do not treat such biases here. 
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Fig. 1. — The input correlation function (black) was used to create a set of JVmc = 1000 Monte Carlo realizations of a simulated map 
(without shot noise). At each angular separation, 95 percent of naive estimates C*o($) for the correlation function fall within the pink 
region. The average of the Monte Carlo ensemble of naive estimates is solid blue, and has fluctuations reduced by a factor x/JVmc ~ 30. 
The sum of the bias terms computed from the input C'(9) is shown as the dashed red line. The ensemble average minus the bias terms is 
shown with the dash-dot blue line, and closely matches the input. 

or as a matrix equation, 

(Co) = MC. (10) 



Writing it this way is somewhat analogous to the MASTER technique ( Hivon et al.||2002 | for CMB power spectrum 
estimation on the partial sky. 
In the appendix we find that the matrix is 

M pq = S pq - 2 ^ 3 -j P / n ,_ 3 _ lq + (|AT0 1 



E i7 ■, diji^anotj 



where the auxiliary operations 



n (l) _ J2k a kdik(0 q ) n(2) _ Em aiakd k i(O q ) ™ r™ 



are functions of the pixel weights. This matrix is composed of three terms. The first term is the identity matrix and 
the following two terms are responsible for the bias. The matrix costs 0(N 2 ) operations to compute, the same as the 
naive correlation function estimator. 

2.2. Monte Carlo simulation 

To test our expression for the bias terms, we performed Monte Carlo simulations of continuous, diffuse fields; 
later we will include shot noise. The survey size, roughly 7' x 8', mimics an actual observation with Chandra. For 
the correlation function C(9) in the simulation, we use a Gaussian function with correlation length (i.e. standard 
deviation) of 3.9', significant compared to the size of the field. For weights we use the inverse of the exposure for a 
real set of observations. These downweight the edges of the observations compared to the center (and correspond to 
inverse- variance pixel weights in the Poisson- noise-dominated limit.) 

Figure fl] shows the input correlation, and the ensemble average (and dispersion) of the naive estimates, which are 
biased. Compared to the input correlation, the ensemble average is offset and the shape differs. The bias terms capture 
this difference, but the bias terms depend on the input correlation function, and so when working with data are not 
directly available. We address this shortcoming in the next section. The matrix M for our example is depicted in 
Figure [2j 

In the simulations shown, we generated the diffuse signal s as a Gaussian random fie l d, but obtain the same results 



with a log-normal random field (constructed with the recipe from Carron & Neyrinck (20121 to keep the same mean 
and correlation function) . The ensemble average and bias terms are the same in the Gaussian and non-Gaussian cases, 
however the non-Gaussianities substantially increase the dispersion of the naive estimates. 
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Fig. 2. — Left: the matrix M which relates the true correlation function to the ensemble average of the naive estimate. The columns 
represent the input scale and the rows the output scale. The matrix is dimensionless. Right: Without the identity matrix, we have the 
biasing terms only. 



2.3. Correcting the naive estimator 
Once we have M, we can define a reconstructed correlation junction C(0 q ) as the solution to the linear equation 



M pq C(6 q ), 



(11) 



where the left-hand-side is the naive estimate we already obtained and the right-hand-side contains our reconstruction. 

Unfortunately this equation does not have a unique solution. Explicit computation in the appendix shows that 
M maps any constant offset to zero. Thus constant offsets to the correlation function are in the null space of the 
matrix. In particular this implies that M is not invertible, ruling out a straightforward solution to the linear equation. 
However, we can recover the true C(9) in the ensemble average up to an unknown constant function. 

Since wc know this matrix has a non-empty null space, we analyze it by singular value decomposition, factoring it 

as 



M = UsV J 



(12) 



where U and V are orthogonal and s is diagonal and contains the singular values. The matrix has one singular value 
near zero, and the column of V that corresponds to the singular mode contains the constant function we identified 
previously as being in the null space. 
The upshot of this discussion is that although M does not have an inverse, we can construct a pseudo-inverse 

M+ = Vs+U T (13) 

where s + is a diagonal matrix constructed from the reciprocal of the diagonal of s except at the singular value where 
it is set to zero. Then the reconstructed correlation function 



(14) 



solves equation (111. This solution is not unique, however, since adding any constant function also yields a solution. 
This procedure c hooses the solution which minimizes the squared norm of the reconstructed correlation function (e.g. 
Press et al.||l992[ ) 

Ei^p)i 2 - ( 15 ) 



Therefore, in the ensemble average, we can reconstruct the correlation matrix up to a constant offset factor, as 
shown in Fig. [3] for our Monte Carlo simulation. This shows how the incorrect shape of the ensemble average has been 
repaired in the reconstruction, except for residual fluctuations in the ensemble average. 

Thus we have 

(C(9 P )) = C{8 p ) + const. (16) 
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Fig. 3. — The ensemble of 1000 realizations made with the input correlation shown in black yields the average naive correlation function 
shown in blue. Multiplying the ensemble average by M + , the pseudo-inverse of the biasing matrix, gives the reconstructed correlation 
function (in green), which has the same shape as the input spectrum, but has lost the information about the constant offset. It resembles 
the input spectrum after the input is offset to minimize the square norm. 

where the constant is unknown. Our estimator is therefore biased. Note however, that the shape is not biased, as we 
can see from a comparison of the reconstructed correlation function at two separations: 

(C(6 P ) - C(6 g )) = C[d p ) + const. - C(9 q ) ~ const. = C(9 p ) - C{6 q ) (17) 

for any scales 9 P and 9 q accessible by the survey. Thus we can say that the shape information is preserved in an 
unbiased way. If we further have theoretical expectations or other constraints, these can help fix the offset for the 
correlation function. 

3. POISSON SHOT NOISE 

If the observations have signifi cant shot nois e from measuring discrete photons or objects, additional bias terms 
appear. We use a Poisson model ( |Peeblesl[T980| §33) for our computations. Let iVj be the count of events in pixel or 
cell i. This quantity is Poisson-distributed with a mean parameter \ that is proportional to our diffuse signal. In our 
X-ray example, Xi = srfiA, where Sj is our diffuse signal from before, representing a photon rate per area, time t% is 
the duration of the pixel's exposure, and A is the pixel's collecting arearl Note is a mean number of counts, and 
so is dimensionless. The Chandra observations we have studied have a large fraction of counts (~ 85 percent) that 
are spurious events unrelated to the cosmic signal. We first derive the bias and corrections for the naive estimator 
neglecting these spurious counts, and then including them. 

3.1. No spurious contamination 
If all the counts are genuinely related to the cosmic signal, the observed rate (R) of signal events is 

Ri = Ni/UA (18) 
which has the same units as Sj. The ensemble average of Ri is 

We can estimate the mean of our rate map 

R = ^±° HRi (20) 

which is an unbiased estimate: (R) = /i. The fluctuation in the map's mean we call 

SR^R-fi (21) 

4 These may differ for other applications. For the example of galaxy counts, the galaxy number density plays the role of the signal and 
the cell volume plays the role of the exposure- weighted area. 
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which has (8R) — 0. The covariance of the observed rate map is 



CoviR^Rj) = + C(%). (22) 



This has an additional shot noise component compared to the covariance of the diffuse signal. The shot noise term 
can be avoided if the sums over pixel pairs exclude common pixels, at the cost of slightly more complicated pixel 
accounting. Here we include it in our computations for completeness. 

Note that since C(0) is a property of the diffuse field's probability distribution, in the discrete case it is not subject 
to any particular new constraints compared to the cont inuous case. T he total number of counts (or objects) summed 



over all pixels is a random variable, and is not fixed (Peebles 1980 cf. §31, 33 vs. §32), and there is no specific 
constraint on the integral of C{6). 

The field s, representing a rate of counts or objects, must be non-negative, which implies that its statistics are non- 
Gaussian. For the derivation of the estimator biases, this matters little because, as before, the higher-order moments 
do not appear in our argument. On the other hand, it may matter more when constructing simulations. A Gaussian 
random field can be a suitable approximation for s, but only if the particular realizations do not contain negative 
pixels, which would lead to negative (and thus ill-defined) expected counts. Otherwise, a log-normal random field, 
which is positive-definite and which we employ below, provides another useful candidate. 

As before we make a naive estimate of the correlation function 

* R(n . V^/,^in,o, f ;//, /,')!/,', /,':• 

In the appendix, we show that the ensemble average of the naive estimator for the discrete field can be written as a 
linear function of both the true mean and the true correlation function. 
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where 

R _ gjj rf t # P )cW(i/M)% - 2E\ 1] + E^] 



( 1) = vm E(2) = E^i/t k f mmm 



and M is the same matrix as before. 
We can express this relationship in matrix form as 



(R) \_f 1 (0...0)\ f fi 



) )~\v R M ) { C 



(24) 



where we used that R is an unbiased estimator for fi. 
Like M before, this larger square m atri x is amenable to the construction of a pseudo-inverse by singular value 



decomposition. Analogous to equation (11), we can solve the linear equation 

(25) 



R \ _ ( 1 (0 . . . 0) \( ft 



C$)~\v R M J \ C 

to reconstruct estimates (on the right-hand side) for the mean (this estimate is unbiased because it just takes the 
already unbiased R directly) and correlation function, with the same limitation as before: a constant function added 
to the correlation function is unconstrained. As before, the shape of the reconstructed correlation function in the 
ensemble average matches the true correlation function. 

3.2. With spurious contamination 

In the presence of an uncorrelated, but spatially varying, set of spurious counts, the analysis changes slightly, with 
the spurious counts contributing additional shot noise terms. In the case of Chandra data, these spurious counts are 
well-characterized in the sense that their mean rate is well-understood. However, counts cannot be classified as signal 
or spurious on an individual basis. 

Now our counts include events from both the signal and the spurious set: Ni — + N^ p . Then the ensemble 
average photon count is (iVj) = fiUA + A^ p , where A* p is the known spurious mean count for each pixel. We redefine 
the signal rate map as 
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Fig. 4. — Similar to figure [T] except including shot noise from signal photons and background events, based on 5000 log-normal random 
fields. The Poisson bias terms (dashed green and cyan) are very small except in the first bin, which contains common pixel pairs. Accounting 
for all bias terms, the average closely matches the input, including at the first bin. 

so that (Ri) = fi. Defining the map mean as before yields (R) = n and the fluctuation from the mean has average 
(SR) = 0. From here the analysis proceeds much as before. Noting that 



m - Af )(Ni - Af )) = (W) + Cov(JVj* N° p ) (27) 



we can show that 



which includes an additional shot noise term compared to the similar eqn. ( 22 ) . 

This allows the ensemble average of the naive estimate to be written as the sum of the spurious-event-free naive 
estimate and additional shot-noise terms which depend on the known mean spurious rate, A| p . In the appendix we 
show that this is: 



Subtracti ng aw ay these spurious terms, we can proceed to reconstruct the correlation function as described at the end 
of section 13.11 



3.3. Poisson Monte Carlo simulation 

For a set of 5000 Monte Carlo realizations that include shot noise, we show in Fig. [4] the ensemble average and 
dispersion for the naive estimate, and also the analytic computation of the bias terms. The mean rate of photons, 
[i = 4.3 x 10~ 9 counts/s/pixel, was chosen based on a real Chandra observation, and is low enough that a Gaussian 
random field with this correlation function will have negative pixels. For this reason we used a log-normal random 
field in this case, which accounts for much of the increase in the dispersion compared to Fig. [T] The shot-noise bias 
terms are large in the first bin of the correlation function, which contains the same-pixel pairs. Elsewhere, they are 
small because in this application, we have enough photons to make the shot noise contribution to SR sub-dominant. 
The bias terms we computed account for the shot noise well. The dispersion due to shot noise is extreme at > 9' 
separations for two reasons: only the periphery of the map provides these separations, so there are few pixel pairs, 
and the effective exposure for pixels at the edge of the map is less, so there are many fewer photons than at the center 
of the field. 

In Fig. |5j we demonstrate that the reconstruction of the correlation function by the singular value decomposition 
method works well to correct the shape distortion in the ensemble average. 

4. CONCLUSIONS 

We have developed an estimator for the correlation function which allows the shape, but not the overall offset, of 
the correlation function to be estimated properly in the ensemble average. If there are significant signal correlations 
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Fig. 5. — Similar to figure [3] reconstructing the correlation function, except including shot noise from signal photons and background 
events. 

on the largest scales that the survey region can probe, as with X-ray observations and some other astronomical data 
sets, the large sample variance will limit the utility of the correlation function shape measurement. However, when 
C(6)'s from multiple fields are averaged, we beat down the noise on the shape, while the average of unknown offsets 
simply yields a new unknown offset. Put another way, averaging improves our knowledge of the shape but does not 
worsen our lack of knowledge about the offset. 

The estimators written here, although motivated by observations of the diffuse X-ray background, easily generalize 
to galaxy counts-in-cells (setting = AfiAfl in section[3]). The estimator can be trivially adapted for cross-correlations 
between fields, or extended from angular correlations m two dimensions to linear or time-series correlations in one 
dimension or spatial correlations in three dimensions. 

These estimators may be usefully applied to any situation with correlations on the scale of the observed region. One 
example is the CMB, which in the ACDM model has significant c orrelations even between points on the sky separated 
by 180°. H owever, estimates from the COBE and WMAP data (jllinsliaw et al.||1996| |Spergel et aT|2003| jCopi et al. 



2007 2009) show surprisingly little correlations at scales larger than 60". These authors have used the biased, naive 



estimator (equation |7|) , but our preliminary tests on WMAP maps and the ACDM CMB correlation function indicate 
that the bias terms we have computed here are too small to account for this difference. 

We have computed the variance of our estimates in Monte Carlo simulations, but not analytically, nor have we tried 
to find optimal weights to minimize the variance. When sample variance dominates the covariance for the correlation 
function, it is unlikely that the optimal weighting can be done on a pixel-by-pixel basis, and instead pixel pairs will 
need to be jointly weighted by the inverse covariance for that pair, accounting for t he signal covarianc e an d the signal 
and spuriou s shot noise. Compared to the real-space estimators we examine here, |Efstathiou| ( |2004[ ) and |Efstathiou| 
et al. (2010) argue that a correlation function estimate built from a maximum likelihood estimate of the harmonic 
space power spectrum will have lower variance, because it effectively gives pixel pairs closer-to-optimal weights in this 
way. This task we leave for future work. 
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APPENDIX 

BIAS TERMS: CONTINUOUS CASE 

In this appendix we compute the bias terms for the continuous signal. Rewriting jl — \i + Sjx, the ensemble average 
of the numerator of the naive estimator Q is 

S^j. ia \ \nra..\ u.xr.\ i„.xr,\ i ixr? 



ekj^onaj - ( Si 5fl) - (s 3 S^ + (^ 2 )] (Al) 



where we have used (5jl) = 0. Further we can use the sum's symmetry between i and j to show that it equals 



E 



dijiejetiotj [C(%) - 2{ Si 8n) + (Sfi 2 )] . (A2) 



If we had used the true mean, only the C(9 i j) term would be present, and we could pull it out of the sum as C(9 p ). 

The sum over weights would cancel the denominator, and we would indeed find that (Co(9 p )) = C(9 p ). This is not 
the case here because of the middle and last terms in the brackets, which are responsible for the bias. 

We can compute both bias terms from the field's correlation function. We call the first bias term B^' because it is 
first order in the mean estimation error 5fl, and compute it as 



J2 k a kC{9 ik ) 



The second bias term, B^ 2 \ which is second order in the mean's error, has no dependence on the pixel index. 

B^ = (5f) = ((f,-rf) 

T,k a k S kJ2l a l S l\ „2 



E fc ; aia k {s k si) _ ^ 

(E/c«fc) 2 

Efci a i a kC(9 k i) 

(J2k a kf 



Because B^ does not depend on the pixel index, this term too can slip outside the sum over pixel pairs in eqn. ( A2 ) 
Therefore, finally, we have 

E» <Uj(Op)atiaj 



(C (9 P )) = C(9 P ) - 2 ^^ZZ 3 + B (2) (A3) 



which states the bias in our estimate explicitly. Each bias term costs 0(N 2 ) operations to compute, the same as the 
correlation function. 

Note that our naive estimator has a peculiar reaction to correlation functions such as C(9) = c for all separations 
sampled by our survey)^] In this case (C (9)) = 0, which we show by examining the bias terms. If C{9) — c, then the 
constant can be set outside the sums, which cancel the denominators. Therefore bias factors I?. = c and B^ = c, 



and the middle term of eqn. (A3) is —2c. Therefore (Cq(6)) = c — 2c + c = 0. Thus, if the naive estimator is viewed 
as a linear operator on the input correlation function, constant functions are in the null space of the operator, since 
any constant maps to zero. Moreover, the naive estimator loses the information about any constant baseline in the 
correlation function, although the information about the shape is preserved. 

5 On scales larger than the survey, this correlation function could vary without changing the discussion. 



10 

The bias terms depend on C(9) only on scales accessible by the survey region, and not on any la rger scales. This 
permits an (imperfect) reconstruction of the correlation function. To proceed, we can rewrite eqn. (A3 1 as a matrix 
multiplication: 

{c {e p )) = Y J M m c{e q ) (A4) 

where the sum is over the angular bins. Then we set about finding the matrix M. 
To write down M, we make use of the relationship 

C{e ik ) = Y,d ik {e q )C{6 q ). (A5) 
<? 

Note that this sum is over angular bin, not pixel. We rewrite the bias terms more explicitly as linear operations on 
the vector C(9 q ). The first bias term is 

B W = T, kq <*kdik(0 q )C(9 q ) = ^D^C(6 q ), (A6) 



where we define 



n (i) _ T,k a kdzk(0 q ) 
lq ~ E k a k ■ (A7) 



Note that the first index refers to pixel and the second to bin. The second bias term is 

B (2) = Sfci aiakdki(O q )C(e q ) = d {2) c iq \ 

\Lk a k) q 

where we define 

n ( 2 ) _ Efci aiakdujOg) 

(T,k a k) 

Since C{6 P ) — J2 q S pq C(9 q ), we finally have 



(A8) 



(A9) 



M pq d pq J2 ijdij{ p)aiaj +v « ^ AW > 

To sum up, in this appendix we have: (1) computed the bias of the naive correlation function estimator; (2) shown 
that the ensemble average of the naive estimate is a linear operation acting upon the true correlation; (3) computed 
that linear operator in terms of the pixel weights; and (4) shown that constant offse ts are in the null space of that 
operator. The method to estimate the shape of the correlation function in section 2.3 depends on these results. 

BIAS TERMS: DISCRETE CASE 
No spurious contamination 



To compute the bias for the discrete case, we write the numerator of the naive estimator (23) in terms of the 
fluctuation of the mean 5R and take the ensemble average: 

dij{8)otiOij{Ri — \i — 5R)(Rj — ji — 5Rjj 
= (^2 diM a i a j [(Ri - »)( R i - At) - 2(J2< - H)SR + (SR) 2 ] J 

= Y J d ij {6)a i a j [(/VM)<% + - 2(R t 5R) + ((SR) 2 )] (Bl) 

Now we examine the last two terms, which are analogous to the bias terms for the diffuse signal. First, 

B^ 1) = (R l SR) = Ek ^ {RlRk) -V 2 

_ £ fc a*[(M/M)tfifc +C(9 ik )] 



T,k a k 

■■EPh + bP (B2) 
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where we define 



E 



(i) _ Oj/tjA 



(B3) 



This shows that for a signal of discrete photons, this bias term can be written as a sum of a new shot noise term and 
the old £?W bias term from the diffuse case. 
The final term is 



B R W = ((SR) 2 ) 



J2u a k{RkRi 

T,kt a k^l 

Efci akai[{ti/UA)6 k i 

(E fe ^) 2 

E^^+B^ 



C{6ki)] 



where we define 



E (2) 



E k al/tkA 
(E fe «*) 2 ' 



(B4) 



(B5) 



Again this bias term has a new, shot-noise component added to the old bias term from the diffuse signal. These shot 
noise bias terms cannot be avoided by excluding i — j from the naive estimator's pixel sums. 

Each of the new shot noise terms is proportional to fx. We can ga ther those terms together and notice that the 
remaining terms are just those which appear on the right side of eqn. (A3), so that: 



E - d i3 (9 p )a iai [(l/UA)5 l3 - 2E^ + E^] 
dij{O p )aiaj 



M + <O)(0p)) 



(B6) 



Therefore the ensemble average of the naive estimate for the discrete signal equals the ensemble average of the naive 
estimate for the diffuse signal plus an additional shot noise bias term which is proportional to the mean of the diffuse 
field. 

Thus the ensemble average of the naive estimator for the discrete field can be written as a linear function of the true 
mean and correlation function. 

(C*(6 P )) = v*n + ]T M pq C(6 q ). (B7) 
<? 

This formulation leads to the reconstruction method for the correlation function discussed in section 13.11 

Including spurious contamination 



Starting from equations ( 26 ) and ( 28 1 , w e fin d that the two bias terms also have additional shot noise components 
due to the spurious signal. Instead ofcqn. (|B2[) we have 



B 



R(l) 



(RiSR) =e\ 1) I i + B 



and instead of eqn. (B4) we have 



£?«( 2 ) = ((5R) 2 ) =E (2 ^ + B ( V 



a 2 (E,« fc ) 2 

eld the naive 

E l3 d tl (e)^(Xt P /t 2 A 2 )S v -2( ai Xr/(t 2 A 2 j: k a k )) J2 k a 2 X^/t 2 



(B8) 



(B9) 



Thus there arc additional terms which can be subtracted away to yield the naive estimator in the contamination-free 
case. 



a 2 (Ejk«*r 



(BIO) 



